In general, tree-based methods have the advantage of a simple and clear interpretation, although their predictive performance is sometimes not as good as that of the models we discussed before.
The fundamental model for regression trees is a binary splitting
tree, as illustrated in the following picture:
We process the regression tree in two steps:
Divide the set of all possible values of \(X_1,X_2,\dots,X_p\) into non-overlapping boxes (by binary splitting) \(R_1,\dots,R_J\).
For every observation that falls into the region \(R_j\), we predict \(\hat{y}_{R_j}\), the average of the training responses in \(R_j\), and measure the fit by \(\sum_{i \in R_j} (y_i-\hat{y}_{R_j})^2\).
We choose the boxes so that \(\sum_{j=1}^J\sum_{i \in R_j} (y_i-\hat{y}_{R_j})^2\) is minimized.
How do we pick the splitting variable and the cutpoint? It is computationally infeasible to consider every possible partition. Instead we use a greedy approach: each time we split a region, we only consider the best current split. More specifically, suppose we are at a region and consider splitting it into \(R_1\) and \(R_2\). For each predictor \(X_j\) and each cutpoint \(s \in \mathbb{R}\), let \(R_1(j,s)=\{X \mid X_j<s\}\) and \(R_2(j,s)=\{X \mid X_j \geq s\}\). We then find the combination \((j,s)\) for which \(\sum_{i:x_i \in R_1(j,s)} (y_i-\hat{y}_{R_1})^2+\sum_{i:x_i \in R_2(j,s)} (y_i-\hat{y}_{R_2})^2\) is minimized.
Once we divide \(R_j\) into \(R_1\) and \(R_2\), we again use the same algorithm to further divide \(R_1\) and \(R_2\) recursively.
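The greedy search can be written out directly in base R; everything below (function names, toy data) is illustrative rather than part of the lab later in these notes:

```r
# Illustrative sketch: exhaustive greedy search for the best single split.
# For each predictor j and each candidate cutpoint s, compute the RSS of
# predicting each half-region by its mean, and keep the minimizer.
best_split <- function(X, y) {
  best <- list(rss = Inf, j = NA, s = NA)
  for (j in seq_len(ncol(X))) {
    for (s in sort(unique(X[, j]))) {
      left  <- y[X[, j] <  s]
      right <- y[X[, j] >= s]
      if (length(left) == 0 || length(right) == 0) next
      rss <- sum((left - mean(left))^2) + sum((right - mean(right))^2)
      if (rss < best$rss) best <- list(rss = rss, j = j, s = s)
    }
  }
  best
}

# Toy example: y jumps when x1 crosses 3, so the best split is on column 1.
X <- cbind(x1 = c(1, 2, 3, 4, 5, 6), x2 = c(5, 1, 4, 2, 6, 3))
y <- c(0, 0, 10, 10, 10, 10)
best_split(X, y)
```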
As we said, one advantage of tree-based methods is that they are easy to interpret. However, if the predictor space is divided into too many regions, the model is no longer practical to interpret. This is another instance of "the trade-off between interpretability and complexity".
What if we split further only if the split leads to a reduction in MSE above some high threshold? One limitation of this approach is that we don't know whether a seemingly meaningless split now will enable a meaningful split later. Thus, instead of growing a small tree, we grow a big tree and then prune it back into a small tree.
We consider minimizing \(\sum_{m=1}^{|T|} \sum_{i: x_i \in R_m} (y_i-\hat{y}_{R_m})^2+\alpha|T|\). Here \(T\) denotes the set of terminal nodes, i.e. the different regions into which the tree splits the predictor space, so \(|T|\) is the total number of regions. \(\alpha\) is a tuning parameter that controls the balance between goodness of fit and model complexity. This criterion is called "Cost Complexity Pruning".
However, what is the best \(\alpha\)? As before, the tuning parameter is selected by cross-validation.
In fact, we can view a regression tree as a special linear model, \(f(X)=\sum_{m=1}^M c_m I_{\{X \in R_m\}}\), but fitted by a different method (not by minimizing least squares over a fixed basis).
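A tiny illustration of this indicator-function view, with hypothetical regions and constants:

```r
# Sketch: a fitted tree is a sum of region indicators times region constants.
# Two hypothetical regions R_1 = {x < 5}, R_2 = {x >= 5} with fitted means c_m.
c_m <- c(2, 8)
f <- function(x) c_m[1] * (x < 5) + c_m[2] * (x >= 5)
f(c(1, 7))   # predicts 2 for x in R_1 and 8 for x in R_2
```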
#Almost all tree-related methods are in library(tree)
library(tree)
library(ISLR2)
attach(Boston)
set.seed(1)
train<-sample(1:nrow(Boston),nrow(Boston)/2)
#tree( ) performs the regression/classification tree method; its interface is quite similar to lm()
tree.boston<-tree(medv~.,Boston,subset=train)
summary(tree.boston)
##
## Regression tree:
## tree(formula = medv ~ ., data = Boston, subset = train)
## Variables actually used in tree construction:
## [1] "rm" "lstat" "crim" "age"
## Number of terminal nodes: 7
## Residual mean deviance: 10.38 = 2555 / 246
## Distribution of residuals:
## Min. 1st Qu. Median Mean 3rd Qu. Max.
## -10.1800 -1.7770 -0.1775 0.0000 1.9230 16.5800
plot(tree.boston)
text(tree.boston,pretty=0)
cv.boston<-cv.tree(tree.boston)
plot(cv.boston$size,cv.boston$dev,type="b")
#prune the tree to 5 terminal nodes
prune.boston<-prune.tree(tree.boston,best=5)
plot(prune.boston)
text(prune.boston,pretty=0)
yhat<-predict(tree.boston,newdata=Boston[-train,])
boston.test<-Boston[-train,"medv"]
plot(yhat,boston.test)
#test-set MSE of the unpruned tree
mean((yhat-boston.test)^2)
## [1] 35.28688
Classification trees are similar to regression trees, except that they deal with a qualitative response. To measure the performance of the tree in region \(R_m\), we can use the classification error rate \(E=1-\max_{k} \hat{p}_{mk}\). However, for growing classification trees, node purity is a better guide than the classification error rate. There are two common purity indices:
Gini Index: \(G=\sum_{k=1}^K \hat{p}_{mk}(1-\hat{p}_{mk})\)
Entropy: \(D=-\sum_{k=1}^K \hat{p}_{mk} \log(\hat{p}_{mk})\)
In fact, these two indices give numerically similar results.
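A quick sketch (helper names are illustrative) comparing the three measures on a two-class node as \(\hat{p}_{m1}\) varies:

```r
# Node-impurity measures for a two-class node with class proportions p and 1-p.
gini    <- function(p) p * (1 - p) + (1 - p) * p
entropy <- function(p) -(p * log(p) + (1 - p) * log(1 - p))
err     <- function(p) 1 - pmax(p, 1 - p)

p <- c(0.1, 0.3, 0.5, 0.7, 0.9)
round(rbind(error = err(p), gini = gini(p), entropy = entropy(p)), 3)
```

All three vanish at a pure node and peak at \(\hat{p}=0.5\); Gini and entropy are smooth in \(\hat{p}\), which is one reason they are preferred when growing the tree.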
attach(Carseats)
#Use ifelse( ) to create a variable called High, which takes the value No if Sales <= 8 and Yes otherwise. ifelse( ) is very useful for classification problems
High<-factor(ifelse(Sales<=8,"No","Yes"))
#use data.frame( ) to merge High into the data
Carseats<-data.frame(Carseats,High)
tree.carseats<-tree(High~.-Sales,Carseats)
summary(tree.carseats)
##
## Classification tree:
## tree(formula = High ~ . - Sales, data = Carseats)
## Variables actually used in tree construction:
## [1] "ShelveLoc" "Price" "Income" "CompPrice" "Population"
## [6] "Advertising" "Age" "US"
## Number of terminal nodes: 27
## Residual mean deviance: 0.4575 = 170.7 / 373
## Misclassification error rate: 0.09 = 36 / 400
In the summary above, the deviance is given by \(-2\sum_m\sum_k n_{mk}\log(\hat{p}_{mk})\), which is closely related to the entropy. The residual mean deviance is the deviance divided by \(n-|T_0|\), where \(|T_0|\) is the number of terminal nodes. In our case, it is 400-27=373.
plot(tree.carseats)
#text() displays the node labels; pretty=0 makes R include the category names for any qualitative variables, instead of displaying a letter for each category
text(tree.carseats,pretty=0)
#directly typing tree.carseats shows the text version of the tree structure.
tree.carseats
## node), split, n, deviance, yval, (yprob)
## * denotes terminal node
##
## 1) root 400 541.500 No ( 0.59000 0.41000 )
## 2) ShelveLoc: Bad,Medium 315 390.600 No ( 0.68889 0.31111 )
## 4) Price < 92.5 46 56.530 Yes ( 0.30435 0.69565 )
## 8) Income < 57 10 12.220 No ( 0.70000 0.30000 )
## 16) CompPrice < 110.5 5 0.000 No ( 1.00000 0.00000 ) *
## 17) CompPrice > 110.5 5 6.730 Yes ( 0.40000 0.60000 ) *
## 9) Income > 57 36 35.470 Yes ( 0.19444 0.80556 )
## 18) Population < 207.5 16 21.170 Yes ( 0.37500 0.62500 ) *
## 19) Population > 207.5 20 7.941 Yes ( 0.05000 0.95000 ) *
## 5) Price > 92.5 269 299.800 No ( 0.75465 0.24535 )
## 10) Advertising < 13.5 224 213.200 No ( 0.81696 0.18304 )
## 20) CompPrice < 124.5 96 44.890 No ( 0.93750 0.06250 )
## 40) Price < 106.5 38 33.150 No ( 0.84211 0.15789 )
## 80) Population < 177 12 16.300 No ( 0.58333 0.41667 )
## 160) Income < 60.5 6 0.000 No ( 1.00000 0.00000 ) *
## 161) Income > 60.5 6 5.407 Yes ( 0.16667 0.83333 ) *
## 81) Population > 177 26 8.477 No ( 0.96154 0.03846 ) *
## 41) Price > 106.5 58 0.000 No ( 1.00000 0.00000 ) *
## 21) CompPrice > 124.5 128 150.200 No ( 0.72656 0.27344 )
## 42) Price < 122.5 51 70.680 Yes ( 0.49020 0.50980 )
## 84) ShelveLoc: Bad 11 6.702 No ( 0.90909 0.09091 ) *
## 85) ShelveLoc: Medium 40 52.930 Yes ( 0.37500 0.62500 )
## 170) Price < 109.5 16 7.481 Yes ( 0.06250 0.93750 ) *
## 171) Price > 109.5 24 32.600 No ( 0.58333 0.41667 )
## 342) Age < 49.5 13 16.050 Yes ( 0.30769 0.69231 ) *
## 343) Age > 49.5 11 6.702 No ( 0.90909 0.09091 ) *
## 43) Price > 122.5 77 55.540 No ( 0.88312 0.11688 )
## 86) CompPrice < 147.5 58 17.400 No ( 0.96552 0.03448 ) *
## 87) CompPrice > 147.5 19 25.010 No ( 0.63158 0.36842 )
## 174) Price < 147 12 16.300 Yes ( 0.41667 0.58333 )
## 348) CompPrice < 152.5 7 5.742 Yes ( 0.14286 0.85714 ) *
## 349) CompPrice > 152.5 5 5.004 No ( 0.80000 0.20000 ) *
## 175) Price > 147 7 0.000 No ( 1.00000 0.00000 ) *
## 11) Advertising > 13.5 45 61.830 Yes ( 0.44444 0.55556 )
## 22) Age < 54.5 25 25.020 Yes ( 0.20000 0.80000 )
## 44) CompPrice < 130.5 14 18.250 Yes ( 0.35714 0.64286 )
## 88) Income < 100 9 12.370 No ( 0.55556 0.44444 ) *
## 89) Income > 100 5 0.000 Yes ( 0.00000 1.00000 ) *
## 45) CompPrice > 130.5 11 0.000 Yes ( 0.00000 1.00000 ) *
## 23) Age > 54.5 20 22.490 No ( 0.75000 0.25000 )
## 46) CompPrice < 122.5 10 0.000 No ( 1.00000 0.00000 ) *
## 47) CompPrice > 122.5 10 13.860 No ( 0.50000 0.50000 )
## 94) Price < 125 5 0.000 Yes ( 0.00000 1.00000 ) *
## 95) Price > 125 5 0.000 No ( 1.00000 0.00000 ) *
## 3) ShelveLoc: Good 85 90.330 Yes ( 0.22353 0.77647 )
## 6) Price < 135 68 49.260 Yes ( 0.11765 0.88235 )
## 12) US: No 17 22.070 Yes ( 0.35294 0.64706 )
## 24) Price < 109 8 0.000 Yes ( 0.00000 1.00000 ) *
## 25) Price > 109 9 11.460 No ( 0.66667 0.33333 ) *
## 13) US: Yes 51 16.880 Yes ( 0.03922 0.96078 ) *
## 7) Price > 135 17 22.070 No ( 0.64706 0.35294 )
## 14) Income < 46 6 0.000 No ( 1.00000 0.00000 ) *
## 15) Income > 46 11 15.160 Yes ( 0.45455 0.54545 ) *
Next we assess the performance of the classification tree on a test set:
set.seed(2)
train<-sample(1:nrow(Carseats),200)
Carseats.test<-Carseats[-train,]
tree.carseats<-tree(High~.-Sales,Carseats,subset=train)
#To predict classes with a classification tree, we need to set type="class"
tree.pred<-predict(tree.carseats,Carseats.test,type="class")
High.test<-High[-train]
table(tree.pred,High.test)
## High.test
## tree.pred No Yes
## No 104 33
## Yes 13 50
mean(tree.pred==High.test)
## [1] 0.77
We will see whether pruning the tree leads to a better result:
set.seed(7)
#cv.tree( ) uses cross-validation to determine the optimal tree complexity
#FUN=prune.misclass means we want the classification error rate to guide the CV, instead of the default guidance, deviance
cv.carseats<-cv.tree(tree.carseats,FUN=prune.misclass)
names(cv.carseats)
## [1] "size" "dev" "k" "method"
cv.carseats
## $size
## [1] 21 19 14 9 8 5 3 2 1
##
## $dev
## [1] 75 75 75 74 82 83 83 85 82
##
## $k
## [1] -Inf 0.0 1.0 1.4 2.0 3.0 4.0 9.0 18.0
##
## $method
## [1] "misclass"
##
## attr(,"class")
## [1] "prune" "tree.sequence"
par(mfrow=c(1,2))
plot(cv.carseats$size,cv.carseats$dev,type="b")
plot(cv.carseats$k,cv.carseats$dev,type="b") # k here is the tuning parameter alpha
#We can also prune the tree to the size we want
prune.carseats<-prune.misclass(tree.carseats,best=9) #here we want a 9-node tree
plot(prune.carseats)
text(prune.carseats,pretty=0)
#Compare pruned tree and the original tree
tree.pred<-predict(prune.carseats,Carseats.test,type="class")
table(tree.pred,High.test)
## High.test
## tree.pred No Yes
## No 97 25
## Yes 20 58
mean(tree.pred==High.test)
## [1] 0.775
Bagging is also called Bootstrap Aggregation. Basically, it improves the performance of decision trees by averaging the results of models fit to different datasets generated by the bootstrap. Suppose we generate \(B\) bootstrap samples. Then:
for regression trees: \(\hat{f}_{bag}(x)=\frac{1}{B}\sum_{b=1}^B \hat{f}^{*b}(x)\), where \(\hat{f}^{*b}(x)\) denotes the prediction from the tree fit to the \(b^{th}\) bootstrap dataset;
for classification trees: \(\hat{f}_{bag}(x)\) is the majority vote, i.e. the class \(k \in K\) predicted most often among the \(B\) trees, where \(K\) denotes the set of all response classes.
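The bootstrap-and-average recipe can be sketched with a toy base learner; everything here (the stump learner, the synthetic data, the names) is illustrative, not part of the lab below:

```r
# Sketch: the bagging recipe with a trivial base learner (a one-split "stump"
# on a single predictor), to show the bootstrap-and-average mechanics.
set.seed(1)
n <- 200
x <- runif(n, 0, 10)
y <- ifelse(x < 5, 2, 8) + rnorm(n, sd = 0.5)   # step function + noise

stump_predict <- function(x_tr, y_tr, x_new) {
  # fit: pick the cutpoint minimizing RSS, predict by the half-region means
  cuts <- sort(unique(x_tr))[-1]
  rss <- sapply(cuts, function(s)
    sum((y_tr[x_tr < s] - mean(y_tr[x_tr < s]))^2) +
    sum((y_tr[x_tr >= s] - mean(y_tr[x_tr >= s]))^2))
  s <- cuts[which.min(rss)]
  ifelse(x_new < s, mean(y_tr[x_tr < s]), mean(y_tr[x_tr >= s]))
}

B <- 50
x_new <- c(2, 7)
preds <- sapply(1:B, function(b) {
  idx <- sample(n, replace = TRUE)     # bootstrap sample
  stump_predict(x[idx], y[idx], x_new) # f-hat^{*b}(x_new)
})
rowMeans(preds)                        # bagged predictions at x = 2 and x = 7
```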
attach(Boston)
## The following objects are masked from Boston (pos = 4):
##
## age, chas, crim, dis, indus, lstat, medv, nox, ptratio, rad, rm,
## tax, zn
#We use the randomForest library to perform both random forests and bagging
library(randomForest)
## randomForest 4.7-1.1
## Type rfNews() to see new features/changes/bug fixes.
set.seed(1)
train<-sample(1:nrow(Boston),nrow(Boston)/2)
#mtry=12 indicates that all 12 predictors are considered for each split of the tree
bag.boston<-randomForest(medv~.,data=Boston,subset=train,mtry=12,importance=TRUE)
bag.boston
##
## Call:
## randomForest(formula = medv ~ ., data = Boston, mtry = 12, importance = TRUE, subset = train)
## Type of random forest: regression
## Number of trees: 500
## No. of variables tried at each split: 12
##
## Mean of squared residuals: 11.25779
## % Var explained: 85.35
#check the performance of this model
yhat.bag<-predict(bag.boston,newdata=Boston[-train,])
plot(yhat.bag,boston.test)
abline(0,1)
mean((yhat.bag-boston.test)^2)
## [1] 23.40359
#We change the number of trees grown using ntree=
bag.boston<-randomForest(medv~.,data=Boston,subset=train,mtry=12,ntree=25)
yhat.bag<-predict(bag.boston,newdata=Boston[-train,])
mean((yhat.bag-boston.test)^2)
## [1] 24.59162
importance(bag.boston)
## IncNodePurity
## crim 903.37902
## zn 85.01131
## indus 88.71500
## chas 21.84022
## nox 221.31936
## rm 11771.53150
## age 343.66607
## dis 267.37081
## rad 65.34431
## tax 138.86751
## ptratio 147.15701
## lstat 4735.66574
Question: how do we estimate the prediction error of bagging? In fact, besides cross-validation, there is a better way of doing this, although the underlying logic is essentially the same.
Recall that prediction error should be estimated on data outside the training set. What are the most accessible data outside each training set? Roughly speaking, each bootstrapped dataset contains about 2/3 of the observations in the original dataset (an easy combinatorial fact). The following picture gives an intuition:
(Figure: bootstrap samples and the observations left out of each.)
The remaining ~1/3 of observations left out of each bootstrapped dataset are referred to as "out-of-bag" (OOB). Each bootstrapped dataset yields one model, and each observation is OOB for roughly B/3 of those models; each such model gives a prediction for that point. Averaging those roughly B/3 predictions gives an overall OOB prediction for each of the n observations, from which we can compute the OOB error.
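The "roughly 2/3" claim follows from each observation being excluded from a given bootstrap sample with probability \((1-1/n)^n \to e^{-1} \approx 0.368\). A quick simulation (illustrative, not part of the lab):

```r
# Each bootstrap sample draws n observations with replacement, so a given
# observation is excluded with probability (1 - 1/n)^n -> exp(-1) ~ 0.368.
set.seed(1)
n <- 500
oob_frac <- replicate(1000, {
  idx <- sample(n, replace = TRUE)
  mean(!(1:n %in% idx))          # fraction of observations left out
})
c(simulated = mean(oob_frac), theoretical = exp(-1))
```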
Limitation: bagging is not so good for interpretation. It can, however, provide an overall measure of the importance of each variable, which is more reliable than that of a single decision tree.
Random forests are similar to bagging, but they solve one difficulty that bagging faces. If there is a very strong predictor, the top split of every tree will tend to choose it, making all the trees very similar and thus highly correlated. To avoid this problem, at each split a random forest randomly chooses \(m \approx \sqrt{p}\) of the \(p\) predictors as split candidates. In this way, we avoid repeatedly using the same strong predictor and thus decorrelate the trees.
#In fact, in the previous block we already performed a degenerate random forest (m=p). For a real RF we typically use m=sqrt(p), but here we use m=6
set.seed(1)
rf.boston<-randomForest(medv~.,data=Boston,subset=train,mtry=6,importance=TRUE)
yhat.rf<-predict(rf.boston,newdata=Boston[-train,])
mean((yhat.rf-boston.test)^2)
## [1] 20.06644
#Use importance( ) to view the importance of each predictor
importance(rf.boston)
## %IncMSE IncNodePurity
## crim 19.435587 1070.42307
## zn 3.091630 82.19257
## indus 6.140529 590.09536
## chas 1.370310 36.70356
## nox 13.263466 859.97091
## rm 35.094741 8270.33906
## age 15.144821 634.31220
## dis 9.163776 684.87953
## rad 4.793720 83.18719
## tax 4.410714 292.20949
## ptratio 8.612780 902.20190
## lstat 28.725343 5813.04833
#For each variable, %IncMSE shows the increase in MSE when the values of that predictor are randomly permuted (measured on the OOB samples); IncNodePurity shows the total decrease in node purity from splits over that predictor, averaged over all trees.
plot(rf.boston)
Boosting is similar to bagging, but each new tree is built on top of the previous ones. Basically, instead of growing each tree on the observations \(y\), we grow the new tree on the residuals of the current model.
The algorithm is as follows:
Set \(\hat{f}(x)=0\) and \(r_i=y_i\) for all \(i\) in the training set.
For \(b=1,2,\dots,B\), repeat:
Fit a tree \(\hat{f}^b\) with d splits (d+1 terminal nodes) to the training data (X,r), where X is the observation matrix and r is the residual vector.
Update the fitted model: \(\hat{f}(x) \leftarrow \hat{f}(x)+\lambda\hat{f}^b(x)\)
Update the residuals: \(r_i \leftarrow r_i-\lambda\hat{f}^b(x_i)\)
\(\lambda\) here is a tuning parameter controlling the speed at which the model learns.
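The steps above can be sketched by hand with depth-one stumps as the base learner (a toy illustration of the residual-fitting loop; the lab below uses gbm( ) instead, and all names and data here are made up):

```r
# Sketch of least-squares boosting with stumps: repeatedly fit the current
# residuals and add a shrunken copy of the fit to the ensemble.
set.seed(1)
n <- 200
x <- runif(n, 0, 10)
y <- sin(x) + rnorm(n, sd = 0.2)

fit_stump <- function(x, r) {
  cuts <- sort(unique(x))[-1]
  rss <- sapply(cuts, function(s)
    sum((r[x < s] - mean(r[x < s]))^2) + sum((r[x >= s] - mean(r[x >= s]))^2))
  s <- cuts[which.min(rss)]
  list(s = s, left = mean(r[x < s]), right = mean(r[x >= s]))
}
predict_stump <- function(st, x) ifelse(x < st$s, st$left, st$right)

lambda <- 0.1; B <- 200
fhat <- rep(0, n); r <- y
for (b in 1:B) {
  st   <- fit_stump(x, r)              # fit a stump to the current residuals
  step <- lambda * predict_stump(st, x)
  fhat <- fhat + step                  # update the model
  r    <- r - step                     # update the residuals
}
mean(r^2)   # training MSE of the boosted model (shrinks as B grows)
```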
#We use gbm( ) in gbm library to perform Boosting
library(gbm)
## Loaded gbm 2.1.8.1
set.seed(1)
#distribution="gaussian" indicates a continuous response fit with squared-error loss
#n.trees denotes the total number of trees to generate
#interaction.depth gives an upper bound for the depth of each tree
boost.boston<-gbm(medv~.,data=Boston[train,],distribution="gaussian",n.trees=5000,interaction.depth=4)
#Notice the summary will directly give a plot
summary(boost.boston)
## var rel.inf
## rm rm 44.48249588
## lstat lstat 32.70281223
## crim crim 4.85109954
## dis dis 4.48693083
## nox nox 3.75222394
## age age 3.19769210
## ptratio ptratio 2.81354826
## tax tax 1.54417603
## indus indus 1.03384666
## rad rad 0.87625748
## zn zn 0.16220479
## chas chas 0.09671228
plot(boost.boston,i="rm")
plot(boost.boston,i="lstat")
yhat.boost<-predict(boost.boston,newdata=Boston[-train,],n.trees=5000)
mean((yhat.boost-boston.test)^2)
## [1] 18.39057
#We can change the tuning parameter lambda using shrinkage=
boost.boston<-gbm(medv~.,data=Boston[train,],distribution="gaussian",n.trees=5000,interaction.depth=4,shrinkage=0.2,verbose=T)
## Iter TrainDeviance ValidDeviance StepSize Improve
## 1 56.9086 nan 0.2000 20.1999
## 2 41.3377 nan 0.2000 12.0504
## 3 30.3945 nan 0.2000 7.3719
## 4 23.8964 nan 0.2000 6.1760
## 5 19.5818 nan 0.2000 4.3711
## 6 15.9997 nan 0.2000 2.8373
## 7 13.2539 nan 0.2000 2.1421
## 8 11.5738 nan 0.2000 1.0605
## 9 10.5059 nan 0.2000 0.4729
## 10 9.6491 nan 0.2000 0.4925
## ... (verbose output for the remaining iterations omitted)
summary(boost.boston)
## var rel.inf
## rm rm 40.6572703
## lstat lstat 33.3357616
## dis dis 7.7631529
## crim crim 4.7836616
## age age 3.7952431
## nox nox 3.1407319
## ptratio ptratio 2.2634352
## tax tax 1.7963332
## indus indus 1.1616263
## rad rad 0.8681043
## chas chas 0.2425151
## zn zn 0.1921646
## 4240 0.0000 nan 0.2000 -0.0000
## 4260 0.0000 nan 0.2000 -0.0000
## 4280 0.0000 nan 0.2000 -0.0000
## 4300 0.0000 nan 0.2000 -0.0000
## 4320 0.0000 nan 0.2000 -0.0000
## 4340 0.0000 nan 0.2000 -0.0000
## 4360 0.0000 nan 0.2000 -0.0000
## 4380 0.0000 nan 0.2000 -0.0000
## 4400 0.0000 nan 0.2000 -0.0000
## 4420 0.0000 nan 0.2000 -0.0000
## 4440 0.0000 nan 0.2000 -0.0000
## 4460 0.0000 nan 0.2000 -0.0000
## 4480 0.0000 nan 0.2000 -0.0000
## 4500 0.0000 nan 0.2000 -0.0000
## 4520 0.0000 nan 0.2000 -0.0000
## 4540 0.0000 nan 0.2000 -0.0000
## 4560 0.0000 nan 0.2000 -0.0000
## 4580 0.0000 nan 0.2000 -0.0000
## 4600 0.0000 nan 0.2000 -0.0000
## 4620 0.0000 nan 0.2000 -0.0000
## 4640 0.0000 nan 0.2000 -0.0000
## 4660 0.0000 nan 0.2000 -0.0000
## 4680 0.0000 nan 0.2000 -0.0000
## 4700 0.0000 nan 0.2000 -0.0000
## 4720 0.0000 nan 0.2000 -0.0000
## 4740 0.0000 nan 0.2000 -0.0000
## 4760 0.0000 nan 0.2000 -0.0000
## 4780 0.0000 nan 0.2000 -0.0000
## 4800 0.0000 nan 0.2000 -0.0000
## 4820 0.0000 nan 0.2000 -0.0000
## 4840 0.0000 nan 0.2000 -0.0000
## 4860 0.0000 nan 0.2000 -0.0000
## 4880 0.0000 nan 0.2000 -0.0000
## 4900 0.0000 nan 0.2000 -0.0000
## 4920 0.0000 nan 0.2000 -0.0000
## 4940 0.0000 nan 0.2000 -0.0000
## 4960 0.0000 nan 0.2000 -0.0000
## 4980 0.0000 nan 0.2000 -0.0000
## 5000 0.0000 nan 0.2000 -0.0000
summary(boost.boston)
## var rel.inf
## rm rm 46.51875040
## lstat lstat 26.80359871
## crim crim 6.10414858
## nox nox 5.13973542
## dis dis 5.04655131
## age age 3.22429018
## ptratio ptratio 2.50610989
## tax tax 2.34618897
## indus indus 1.28463718
## rad rad 0.90313172
## zn zn 0.11110023
## chas chas 0.01175742
yhat.boost<-predict(boost.boston,newdata=Boston[-train,],n.trees=5000)
#The long log above was produced because verbose=T was set in the gbm call; it records and prints the progress of the whole boosting procedure.
mean((yhat.boost-boston.test)^2)
## [1] 18.57148
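Since boosting is sensitive to its tuning parameters, it is worth refitting with a different learning rate to see how the test MSE changes. Below is a minimal self-contained sketch, assuming the `gbm` and `MASS` packages and re-creating the same train/test split as before; `boost.small` and `yhat.small` are names introduced here for illustration.

```r
# Refit the boosting model with a smaller shrinkage (learning rate).
# A smaller shrinkage usually needs more trees to reach the same fit.
library(gbm)
library(MASS)
set.seed(1)
train <- sample(1:nrow(Boston), nrow(Boston) / 2)
boston.test <- Boston[-train, "medv"]
boost.small <- gbm(medv ~ ., data = Boston[train, ],
                   distribution = "gaussian", n.trees = 5000,
                   interaction.depth = 4, shrinkage = 0.01,
                   verbose = FALSE)
yhat.small <- predict(boost.small, newdata = Boston[-train, ], n.trees = 5000)
mean((yhat.small - boston.test)^2)
```

The exact MSE will differ from the shrinkage=0.2 fit above; comparing the two is a simple way to tune the learning rate by hand.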
BART can be viewed as a combination of bagging and boosting. In general, we run B iterations. In each iteration, we grow K new trees based on the trees grown in the previous iteration. Thus, across iterations the procedure resembles boosting, while within each iteration it resembles bagging. The following displays the algorithm (note that \(\hat{f}^b_k(x)\) denotes the prediction of the \(k^{th}\) tree in the \(b^{th}\) iteration):
Let \(\hat{f}^1_1(x) = \hat{f}^1_2(x) = \dots = \hat{f}^1_K(x) = \frac{1}{nK} \sum_{i=1}^n y_i\)
Compute \(\hat{f}^1(x)=\sum_{k=1}^K \hat{f}_k^1 (x)=\frac{1}{n}\sum_{i=1}^n y_i\)
For b=2,…, B:
(a) For k=1,…,K:
(a1) For i=1,…,n, compute the current partial residual: \(r_i=y_i-\sum_{k'<k} \hat{f}_{k'}^{b}(x_i)-\sum_{k'>k}\hat{f}_{k'}^{b-1}(x_i)\)
(a2) Fit a new tree, \(\hat{f}_k^b(x)\), to \(r_i\), by randomly perturbing the \(k^{th}\) tree from the previous iteration, \(\hat{f}_k^{b-1}(x)\). Perturbations that improve the fit are favored. (The detailed choices of perturbation are described in the following picture.)
(b) Compute \(\hat{f}^b(x)=\sum_{k=1}^K \hat{f}_k^b(x)\)
Finally, after B iterations, discard the first L burn-in iterations and average the rest: \(\hat{f}(x)=\frac{1}{B-L}\sum_{b=L+1}^B \hat{f}^b(x)\)
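The bookkeeping of the step-(a1) partial-residual update can be sketched in a few lines of base R. This is a toy illustration, not the real BART sampler: each "tree" here is degenerate and predicts a single constant (the mean of its current partial residual), so there is no tree perturbation, only the cycling of residuals over k and b.

```r
# Toy sketch of the BART backfitting loop (NOT the actual sampler):
# K "trees" each predict one constant; we cycle through them, refitting
# each to the partial residual that excludes its own prediction.
set.seed(1)
n <- 100; K <- 5; B <- 20
y <- rnorm(n, mean = 10)
# Step 1: every tree starts at mean(y)/K, so the initial sum is mean(y)
fhat <- rep(mean(y) / K, K)
for (b in 2:B) {
  for (k in 1:K) {
    # partial residual: y minus the predictions of all trees except the k-th
    r <- y - (sum(fhat) - fhat[k])
    # "fit" the k-th tree to the partial residual (here: just its mean)
    fhat[k] <- mean(r)
  }
}
sum(fhat)  # overall prediction; equals mean(y) for these constant trees
mean(y)
```

With real regression trees in place of the constants, the same loop lets each tree explain the structure the other K-1 trees have missed, which is the core idea of the algorithm above.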
library(BART)
## Loading required package: nlme
## Loading required package: nnet
## Loading required package: survival
x<-Boston[,1:12]
y<-Boston[,"medv"]
xtrain<-x[train,]
ytrain<-y[train]
xtest<-x[-train,]
ytest<-y[-train]
set.seed(1)
bartfit<-gbart(xtrain,ytrain,x.test=xtest)
## *****Calling gbart: type=1
## *****Data:
## data:n,p,np: 253, 12, 253
## y1,yn: 0.213439, -5.486561
## x1,x[n*p]: 0.109590, 20.080000
## xp1,xp[np*p]: 0.027310, 7.880000
## *****Number of Trees: 200
## *****Number of Cut Points: 100 ... 100
## *****burn,nd,thin: 100,1000,1
## *****Prior:beta,alpha,tau,nu,lambda,offset: 2,0.95,0.795495,3,3.71636,21.7866
## *****sigma: 4.367914
## *****w (weights): 1.000000 ... 1.000000
## *****Dirichlet:sparse,theta,omega,a,b,rho,augment: 0,0,1,0.5,1,12,0
## *****printevery: 100
##
## MCMC
## done 0 (out of 1100)
## done 100 (out of 1100)
## done 200 (out of 1100)
## done 300 (out of 1100)
## done 400 (out of 1100)
## done 500 (out of 1100)
## done 600 (out of 1100)
## done 700 (out of 1100)
## done 800 (out of 1100)
## done 900 (out of 1100)
## done 1000 (out of 1100)
## time: 2s
## trcnt,tecnt: 1000,1000
summary(bartfit)
## Length Class Mode
## sigma 1100 -none- numeric
## yhat.train 253000 -none- numeric
## yhat.test 253000 -none- numeric
## varcount 12000 -none- numeric
## varprob 12000 -none- numeric
## treedraws 2 -none- list
## proc.time 5 proc_time numeric
## hostname 1 -none- logical
## yhat.train.mean 253 -none- numeric
## sigma.mean 1 -none- numeric
## LPML 1 -none- numeric
## yhat.test.mean 253 -none- numeric
## ndpost 1 -none- numeric
## offset 1 -none- numeric
## varcount.mean 12 -none- numeric
## varprob.mean 12 -none- numeric
## rm.const 12 -none- numeric
bartfit$varcount.mean
## crim zn indus chas nox rm age dis rad tax
## 11.607 15.576 19.073 19.283 22.973 20.725 19.278 13.800 21.638 20.021
## ptratio lstat
## 19.615 21.653
yhat.bart<-bartfit$yhat.test.mean
mean((ytest-yhat.bart)^2) #The test MSE here is lower than that of boosting above
## [1] 15.91912
#check how many times, on average, each variable appears across the BART trees
ord<-order(bartfit$varcount.mean,decreasing=T)
ord
## [1] 5 12 9 6 10 11 4 7 3 2 8 1
bartfit$varcount.mean[ord]
## nox lstat rad rm tax ptratio chas age indus zn
## 22.973 21.653 21.638 20.725 20.021 19.615 19.283 19.278 19.073 15.576
## dis crim
## 13.800 11.607